This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net), for keyword spotting. It is optimized specifically for low-power processor units such as microcontrollers. ML operators exhibit heterogeneous efficiency profiles on power-efficient hardware: at the same theoretical computation cost, int8 operators are more computation-effective than float operators, and linear layers are often more efficient than other layers. The proposed LiCo-Net is a dual-phase system that uses efficient int8 linear operators in the inference phase and applies streaming convolutions in the training phase to maintain high model capacity. The experimental results show that LiCo-Net outperforms the singular value decomposition filter (SVDF) on hardware efficiency with on-par detection performance. Compared to SVDF, LiCo-Net reduces cycles by 40% on the HiFi4 DSP.
translated by 谷歌翻译
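The LiCo-Net abstract rests on a general identity: a one-dimensional convolution over a window of frames computes the same values as a linear layer applied to the flattened window. The sketch below illustrates only that identity in plain Python. The function names, the single output channel, and the toy data are assumptions made for illustration; this is not the paper's architecture and it does not model int8 quantization.

```python
def conv1d_valid(frames, kernel):
    """Sliding-window ("valid") 1D convolution with one output channel.

    frames: list of T frames, each a list of C feature values.
    kernel: list of K rows, each a list of C weights.
    """
    K = len(kernel)
    out = []
    for t in range(len(frames) - K + 1):
        acc = 0
        for k in range(K):
            for c in range(len(kernel[k])):
                acc += kernel[k][c] * frames[t + k][c]
        out.append(acc)
    return out


def linear_on_window(frames, kernel):
    """Same computation expressed as a dot product on the flattened window."""
    K = len(kernel)
    flat_w = [w for row in kernel for w in row]  # K*C weights of the linear layer
    out = []
    for t in range(len(frames) - K + 1):
        flat_x = [x for frame in frames[t:t + K] for x in frame]
        out.append(sum(w * x for w, x in zip(flat_w, flat_x)))
    return out


# Toy input: 4 frames with 2 features each, kernel spanning 3 frames.
frames = [[1, 2], [3, 4], [5, 6], [7, 8]]
kernel = [[1, 0], [0, -1], [2, 1]]
print(conv1d_valid(frames, kernel))      # [13, 19]
print(linear_on_window(frames, kernel))  # [13, 19]
```

Because both forms share the same weights, a model could in principle be trained in the convolutional form and run in the linear form; how LiCo-Net performs that conversion is not described in the abstract.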
Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.
The proliferation of unmanned aircraft systems (UAS) has caused airspace regulation authorities to examine the interoperability of these aircraft with collision avoidance systems initially designed for large transport category aircraft. Limitations in the currently mandated TCAS led the Federal Aviation Administration to commission the development of a new solution, the Airborne Collision Avoidance System X (ACAS X), designed to enable a collision avoidance capability for multiple aircraft platforms, including UAS. While prior research explored using deep reinforcement learning (DRL) algorithms for collision avoidance, DRL did not perform as well as existing solutions. This work explores the benefits of using a DRL collision avoidance system whose parameters are tuned using a surrogate optimizer. We show that the use of a surrogate optimizer leads to a DRL approach that can increase safety and operational viability and support future capability development for UAS collision avoidance.
Structured channel pruning has been shown to significantly reduce inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy. Recent works permanently zero the pruned channels during training, which we observe to significantly hamper final accuracy, particularly as the fraction of the network being pruned increases. We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint. By adding a soft mask re-parameterization of the weights and approaching channel pruning from the perspective of removing input channels, we allow gradient updates to previously pruned channels and give those channels the opportunity to return to the network later. We then formulate input channel pruning as a global resource allocation problem. Our method outperforms prior works on both the ImageNet classification and PASCAL VOC detection datasets.
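The soft-masking idea can be sketched in a few lines: the dense weights are kept and updated throughout training, the mask is applied only when the effective weights are formed, and the mask is recomputed so that a previously pruned channel can return. Everything in this sketch is an assumption for illustration: the function names, the squared-norm ranking used as the importance score, and the fixed `keep` budget standing in for the cost constraint. The abstract does not specify SMCP's importance measure or its resource allocation procedure.

```python
def channel_mask(weights, keep):
    """Return a 0/1 mask keeping the `keep` channels with the largest squared norm.

    weights: list of per-input-channel weight vectors (hypothetical layout).
    """
    norms = [sum(w * w for w in ch) for ch in weights]
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = set(ranked[:keep])
    return [1 if i in kept else 0 for i in range(len(weights))]


def effective_weights(weights, mask):
    """Weights actually used in the forward pass: masked channels contribute zero."""
    return [[m * w for w in ch] for ch, m in zip(weights, mask)]


def sgd_step(weights, grads, lr):
    """Update the dense weights, including channels that are currently masked."""
    return [[w - lr * g for w, g in zip(ch, gch)] for ch, gch in zip(weights, grads)]


weights = [[3, 0], [1, 0], [2, 0]]
mask = channel_mask(weights, keep=2)            # channel 1 is pruned: [1, 0, 1]
pruned = effective_weights(weights, mask)       # [[3, 0], [0, 0], [2, 0]]

# Assumed gradient that still reaches the masked channel 1.
grads = [[0, 0], [-4, 0], [0, 0]]
weights = sgd_step(weights, grads, lr=1)        # [[3, 0], [5, 0], [2, 0]]
new_mask = channel_mask(weights, keep=2)        # channel 1 returns: [1, 1, 0]
```

With a hard mask, channel 1 would stay at zero permanently; here it regains enough magnitude to displace channel 2 at the next mask update.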
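Moral framing and sentiment can affect a variety of online and offline behaviors, including donation, pro-environmental action, political engagement, and even participation in violent protests. Various computational methods in natural language processing (NLP) have been used to detect moral sentiment from textual data, but achieving better performance on such subjective tasks requires large amounts of hand-annotated training data. Previous corpora annotated for moral sentiment have proven valuable and have generated new insights both within NLP and across the social sciences, but they have been limited to Twitter. To advance our understanding of the role of moral rhetoric, we present the Moral Foundations Reddit Corpus, a collection of 16,123 Reddit comments curated from 12 distinct subreddits and hand-annotated by at least three trained annotators for 8 categories of moral sentiment (i.e., Care, Proportionality, Equality, Purity, Authority, Loyalty, Thin Morality, Implicit/Explicit Morality) based on the updated Moral Foundations Theory (MFT) framework. We use a range of methods, such as cross-domain classification and knowledge transfer, to provide baseline moral sentiment classification results for this new corpus.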
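In this work, we introduce a biologically inspired long-range skip connection for the UNet architecture that relies on the perceptual illusion of hybrid images, which are images that simultaneously encode two images. The fusion of early encoder features with deeper decoder features allows UNet models to produce finer-grained dense predictions. While proven in segmentation tasks, these benefits are diminished for dense regression tasks because the long-range skip connections also produce texture transfer artifacts. For depth estimation in particular, this hurts smoothness and introduces false positive edges, which are detrimental to the task because depth maps are piecewise smooth. The proposed HybridSkip connections show improved performance in balancing the trade-off between edge preservation and the minimization of texture transfer artifacts that hurt smoothness. This is achieved through the proper and balanced exchange of information that HybridSkip connections provide between the high-frequency and low-frequency features of the encoder and decoder, respectively.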
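Humans can leverage physical interaction to teach robot arms. As the human kinesthetically guides the robot through demonstrations, the robot learns the desired task. While prior work focuses on how the robot learns, it is equally important for the human teacher to understand what their robot is learning. Visual displays can communicate this information; however, we hypothesize that visual feedback alone misses out on the physical connection between the human and the robot. In this paper, we introduce a novel class of soft haptic displays that wrap around the robot arm, adding signals without affecting interaction. We first design a pneumatic actuation array that remains flexible in mounting. We then develop single-dimensional and multi-dimensional versions of this wrapped haptic display, and explore human perception of the rendered signals during psychophysical tests and robot learning. We ultimately find that people accurately distinguish single-dimensional feedback with a Weber fraction of 11.4%, and identify multi-dimensional feedback with 94.5% accuracy. When physically teaching robot arms, humans leverage the single-dimensional feedback to provide better demonstrations than with visual feedback: our wrapped haptic display decreases teaching time while increasing demonstration quality. This improvement depends on the location and distribution of the wrapped haptic display. Videos of our device and experiments are available here: https://youtu.be/ypcmgeqsjdm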
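Retrieving images with objects that are semantically similar to objects of interest (OOI) in a query image has many practical use cases. Examples include fixing failures such as false negatives or false positives of a learned model, and mitigating class imbalance in a dataset. The targeted selection task requires finding the relevant data in a large-scale pool of unlabeled data. Manual mining at this scale is infeasible. Furthermore, the OOI are often small, occupying less than 1% of the image area, are occluded, and co-exist with many semantically different objects in cluttered scenes. Existing semantic image retrieval methods often focus on mining larger geographical landmarks, and/or require extra labeled data, such as images or image pairs with similar objects, to mine images with generic objects. We propose a fast and robust template matching algorithm in the DNN feature space that retrieves semantically similar images at the object level from a large unlabeled pool of data. We project the region around the OOI in the query image into the DNN feature space for use as the template. This enables our method to focus on the semantics of the OOI without requiring extra labeled data. In the context of autonomous driving, we evaluate our system for targeted selection by using failure cases of object detectors as OOI. We demonstrate its efficacy on a large unlabeled dataset with 2.2M images and show high recall in mining images with small OOI. We compare our method against a well-known semantic image retrieval method that also does not require extra labeled data. Finally, we show that our method is flexible and seamlessly retrieves images with one or more semantically different co-occurring OOI.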
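Given a small training dataset and a learning algorithm, how much more data is needed to reach a target validation or test performance? This question is of critical importance in applications such as autonomous driving or medical imaging, where collecting data is expensive and time-consuming. Overestimating or underestimating data requirements incurs substantial costs that could be avoided with an adequate budget. Prior work on neural scaling laws suggests that a power-law function can fit the validation performance curve and extrapolate it to larger dataset sizes. We find that this does not immediately translate to the more difficult downstream task of estimating the dataset size required to meet a target performance. In this work, we consider a broad class of computer vision tasks and systematically investigate a family of functions that generalize the power-law function to allow for better estimation of data requirements. Finally, we show that incorporating a tuned correction factor and collecting data over multiple rounds significantly improves the performance of the data estimators. Using our guidelines, practitioners can accurately estimate the data requirements of machine learning systems and save both development time and data acquisition costs.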
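The baseline that this abstract starts from, fitting a power law to a validation curve and inverting it for a target, can be sketched as follows. The sketch assumes the plain form error = a * n^(-b), fitted by least squares in log-log space; the function names, the `correction` multiplier, and the toy measurements are hypothetical. The generalized function family and the tuned correction factor studied in the paper are not specified in the abstract and are not reproduced here.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by ordinary least squares on (log n, log error)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (a, b)


def required_size(a, b, target_error, correction=1.0):
    """Invert the fitted curve: dataset size at which the predicted error hits the target.

    `correction` is a hypothetical multiplier applied to the estimate.
    """
    return correction * (a / target_error) ** (1.0 / b)


# Toy measurements that follow error = 2 * n**(-0.5) exactly.
sizes = [100, 400, 1600]
errors = [0.2, 0.1, 0.05]
a, b = fit_power_law(sizes, errors)
n_needed = required_size(a, b, target_error=0.025)  # about 6400 samples
```

Because the estimate depends on 1/b, small errors in the fitted exponent are amplified when extrapolating, which is consistent with the abstract's observation that a good curve fit does not directly yield a good data requirement estimate.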
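Measuring semantic similarity between job titles is an essential functionality for automatic job recommendation. This task is usually approached with supervised learning techniques, which require training data in the form of equivalent job title pairs. In this paper, we instead propose an unsupervised representation learning method for training a job title similarity model using noisy skill labels. We show that it is highly effective for tasks such as text ranking and job normalization.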